CSE 255 Assignment 9

Authors

  • Alexander Asplund
  • William Fedus
Abstract

In this paper we train logistic regression models for two forms of link prediction among a set of 244 suspected terrorists in a social network. We train and test on a dataset created at the University of Maryland and further modified at UCSD by Eric Doi and Ke Tang [2]. Each link between two suspected terrorists carries a label describing its nature: colleague, family, contact, or congregate. We combine structural information about the known network connectivity with additional binary information about the individuals to arrive at two final models. The first model predicts whether a link of any type exists between two individuals; the second classifies an existing link as 'colleague' or 'other'. On the link prediction task, our final logistic regression, with a per-example cost of 117, achieves an average AUC of 0.93. On the link classification task, the final linear logistic regression, with a per-example regularization of 33.7, achieves a 0/1 accuracy of 0.92.
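
The link prediction setup described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the ten-node toy graph, the single "common neighbors" structural feature, the plain gradient-descent trainer, and the `l2`, `lr`, and `epochs` values are all hypothetical. The paper's binary attribute features are omitted here; they would simply be appended to each pair's feature vector. For brevity the toy run scores the same pairs it trains on, which a real evaluation would not do.

```python
import math
from itertools import combinations

# Hypothetical toy graph (NOT the paper's data): two tight groups of five
# people, joined by a single cross-group link (4, 5).
edges = {
    (0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4),
    (5, 6), (5, 7), (5, 8), (6, 7), (6, 8), (6, 9), (7, 8), (7, 9), (8, 9),
    (4, 5),
}
nodes = range(10)
adj = {n: set() for n in nodes}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)


def common_neighbors(adj, u, v):
    """Structural feature: number of people linked to both u and v."""
    return len(adj[u] & adj[v])


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, l2=0.01, lr=0.1, epochs=500):
    """Batch gradient descent on the mean log loss with an L2 penalty on w."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def auc(scores, labels):
    """Chance that a random positive outscores a random negative (ties 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# One example per unordered pair of people; label 1 if any link exists.
pairs = list(combinations(nodes, 2))
X = [[float(common_neighbors(adj, u, v))] for u, v in pairs]
y = [1 if (u, v) in edges else 0 for u, v in pairs]

w, b = train_logistic(X, y)
scores = [sigmoid(b + w[0] * x[0]) for x in X]
score = auc(scores, y)
print(f"weight={w[0]:.3f}  AUC={score:.3f}")
```

AUC suits this kind of task because linked pairs are typically a small fraction of all possible pairs, so a ranking-based metric is more informative than raw accuracy.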

Similar resources

CSE 255: Assignment 1 - Exploring Musical Tagging

We explore two predictive tasks: (i) a measure of tag probability, and (ii) identifying a minimum tag set for more meaningful music classification on a 100,000 song dataset joined across complementary databases from the 1 Million Song Dataset (“MSD”). We conclude that a tag set size of around 50 tags is most meaningful and report many of our findings/analysis based on the top 50 tags. Using lin...

CSE 255 Assignment 2 Cuisine Prediction/Classification based on ingredients

In this paper, we consider different strategies for identifying the cuisine, given its ingredients. This project aims to explore what combination of ingredients is helpful in identifying a cuisine if the recipe is not given. This has been tackled as a problem of cuisine classification. We also explore different classification algorithms in tandem with approaches like taking combination of multi...

CSE 255 Assignment 1: Helpfulness in Amazon Reviews

In this paper we consider models for predicting the helpfulness rating of Amazon book reviews. We examine features such as the review's star rating, the length of the review text, the readability of the review text, and the number of comparisons made in the review. We compare Support Vector Machine and Random Forest models for both regression and classification.

CSE 255 Assignment 2 : Upvotes Prediction for Reddit Submissions

In this paper we consider models for predicting the number of upvotes on a reddit submission. We examine features such as the number of votes, number of comments, time of submission, upvote history of users, images, and subreddits of the submission. We compare Support Vector Regression, Linear Regression, and Gradient Boosting Regression models for predicting the number of upvotes.

Publication date: 2015